background-image: url("data:image/png;base64,#https://github.com/allisonhorst/stats-illustrations/raw/master/rstats-artwork/welcome_to_rstats_twitter.png")
background-position: 50% 0%
background-size: 60%
class: bottom

## Writing reproducible manuscripts in R

[**Shilaan Alzahawi**](http://shilaan.rbind.io) @ Stanford Graduate School of Business

Artwork by [@allison_horst](https://github.com/allisonhorst/stats-illustrations)

---

### Do your data sci like it's going to need an alibi

<img src="data:image/png;base64,#https://github.com/allisonhorst/stats-illustrations/raw/master/rstats-artwork/reproducibility_court.png" width="100%" style="display: block; margin: auto;" />

Slides at [bit.ly/shilaan-apa](https://bit.ly/shilaan-apa)

Artwork by [@allison_horst](https://github.com/allisonhorst/stats-illustrations)

---

# Outline

--

**What?** 📝 `\(~\)` Reproducible manuscripts

--

**Why?** ✅ `\(~\)` Benefits

--

**How?** 🛠 `\(~\)` Tutorial

`\(~~~~~~\)` pt. 1: An introduction to **R Markdown**

`\(~~~~~~\)` pt. 2: An introduction to **papaja**

--

`\(~~\)` Example manuscripts at [github.com/shilaan/example-manuscripts](https://github.com/shilaan/example-manuscripts)

???

This repo includes a link to the slides and instructions for installing R, RStudio, and RMarkdown, and for cloning the contents of the repository to your local computer. Once you've cloned the contents of this repo, you'll be able to follow along at your own pace and take a look at three fully reproducible manuscripts generated with R Markdown.
---
background-image: url("data:image/png;base64,#https://github.com/allisonhorst/stats-illustrations/raw/master/make-your-own-stats-cartoons/ex_3.png")
background-position: 100% 90%
background-size: 38%

# Introduction 🧊🔨

--

**Statistics** @ Ghent University

**Organizational Behavior** @ Stanford GSB

🔎 `\(~\)` Statistical inference & hypothesis testing

🔎 `\(~\)` Open and reproducible science

🔎 `\(~\)` Crowdsourced & big team science

---

# The typical workflow

When writing a scientific report, the typical workflow is to ...

--

1. Do your analyses (e.g., in `R` or `Python`)

--

2. Copy-paste or otherwise save your graphs and results

--

3. Open a program (e.g., `Microsoft Word`) to communicate the results

--

4. Manually format your results and citations

--

### Discussion questions

--

What are common challenges when working in this fashion? What kind of problems could arise?

---
class:center, middle

<iframe width="1120" height="630" src="https://www.youtube.com/embed/s3JldKoA0zw" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

---

<img src="data:image/png;base64,#http://swcarpentry.github.io/git-novice/fig/phd101212s.png" width="60%" style="display: block; margin: auto;" />

---

# Typical workflow challenges

--

- Time-consuming

--

- Error-prone (e.g., rounding or transcription errors)

--

- Lacks transparency; difficult to reproduce (by others **and** yourself!)

--

- Difficult to maintain and update (endless rewriting and reformatting...)

--

- Overhead costs of different computing/software environments

--

- **Anything else...?**

---
background-image: url("data:image/png;base64,#https://upload.wikimedia.org/wikipedia/en/f/ff/SuccessKid.jpg")
background-position: 50% 92%
background-size: 45%

## An alternative workflow: What?
--

- Fuse your code and writing

--

- Directly embed results in your report

--

- Automatically reflect analytic changes in your documentation

--

- Update all your results, figures, and tables automatically

--

- Automatic formatting (including citations!)

---
background-image: url("data:image/png;base64,#https://raw.githubusercontent.com/allisonhorst/stats-illustrations/master/rstats-artwork/data_cowboy.png")
background-position: 90% 40%
background-size: 50%

## An alternative workflow: Why?

--

Less...

--

⬇️ Error-prone

--

⬇️ Time-consuming

--

More...

--

⬆️ Dynamic

--

⬆️ Reproducible

--

⬆️ Transparent

---
background-image: url("data:image/png;base64,#https://bookdown.org/yihui/rmarkdown/images/hex-rmarkdown.png")
background-position: 50% 90%
background-size: 20%

## Our weapon of choice: RMarkdown

--

- RMarkdown is an **authoring framework for data science**, designed for reproducibility

--

- The same document holds the code and the narrative surrounding the data

--

- Results are automatically generated from the code

--

- You can use a single R Markdown file to

✓ save and execute code, and

✓ generate high quality reports that can be shared with an audience

---

<img src="data:image/png;base64,#https://github.com/allisonhorst/stats-illustrations/raw/master/rstats-artwork/rmarkdown_rockstar.png" width="80%" style="display: block; margin: auto;" />

Artwork by [@allison_horst](https://github.com/allisonhorst/stats-illustrations): **Get your code, text, and outputs in the same (reproducible) place**

---

## Introduction to RMarkdown

--

- Create dynamic analysis documents that combine code, output (incl. figures and tables), and writing

--

- Can be used to

✓ Reproduce your analyses

✓ Collaborate and share code with others

✓ Communicate your results with others

--

- Output formats include HTML, PDF, Word and...
🤩 Slide shows ([bit.ly/shilaan-apa](https://shilaan-apa.netlify.app))

🤩 Websites ([shilaan.rbind.io](http://shilaan.rbind.io))

🤩 Blogs

🤩 Books

🤩 Dashboards

🤩 Interactive documents

🤩 Conference posters

🤩 Manuscripts

---
background-image: url("data:image/png;base64,#images/manuscript.png")
background-position: 50% 80%
background-size: contain

## Sneak peek: the power of RMarkdown

---
background-image: url("data:image/png;base64,#images/cites.png")
background-position: 50% 70%
background-size: contain

## Sneak peek: the power of RMarkdown

---
background-image: url("data:image/png;base64,#images/refs.png")
background-position: 50% 50%
background-size: contain

## Sneak peek: the power of RMarkdown

---

## Discussion question

#### Are there good reasons for **not** using **RMarkdown**?

--

<svg viewBox="0 0 576 512" style="position:relative;display:inline-block;top:.1em;fill:#035AA6;height:1.5em;" xmlns="http://www.w3.org/2000/svg"> <path d="M569.517 440.013C587.975 472.007 564.806 512 527.94 512H48.054c-36.937 0-59.999-40.055-41.577-71.987L246.423 23.985c18.467-32.009 64.72-31.951 83.154 0l239.94 416.028zM288 354c-25.405 0-46 20.595-46 46s20.595 46 46 46 46-20.595 46-46-20.595-46-46-46zm-43.673-165.346l7.418 136c.347 6.364 5.609 11.346 11.982 11.346h48.546c6.373 0 11.635-4.982 11.982-11.346l7.418-136c.375-6.874-5.098-12.654-11.982-12.654h-63.383c-6.884 0-12.356 5.78-11.981 12.654z"></path></svg> `\(~~\)` Barriers to **collaborating** with others (requires additional tools: **Git/GitHub**)

--

<svg viewBox="0 0 576 512" style="position:relative;display:inline-block;top:.1em;fill:#035AA6;height:1.5em;" xmlns="http://www.w3.org/2000/svg"> <path d="M569.517 440.013C587.975 472.007 564.806 512 527.94 512H48.054c-36.937 0-59.999-40.055-41.577-71.987L246.423 23.985c18.467-32.009 64.72-31.951 83.154 0l239.94 416.028zM288 354c-25.405 0-46 20.595-46 46s20.595 46 46 46 46-20.595 46-46-20.595-46-46-46zm-43.673-165.346l7.418 136c.347 6.364 5.609 11.346 11.982 11.346h48.546c6.373 0
11.635-4.982 11.982-11.346l7.418-136c.375-6.874-5.098-12.654-11.982-12.654h-63.383c-6.884 0-12.356 5.78-11.981 12.654z"></path></svg> `\(~~\)` Not the best format for **computationally expensive functions**

--

<svg viewBox="0 0 512 512" style="position:relative;display:inline-block;top:.1em;fill:#035AA6;height:1.5em;" xmlns="http://www.w3.org/2000/svg"> <path d="M256 8C119.043 8 8 119.083 8 256c0 136.997 111.043 248 248 248s248-111.003 248-248C504 119.083 392.957 8 256 8zm0 448c-110.532 0-200-89.431-200-200 0-110.495 89.472-200 200-200 110.491 0 200 89.471 200 200 0 110.53-89.431 200-200 200zm107.244-255.2c0 67.052-72.421 68.084-72.421 92.863V300c0 6.627-5.373 12-12 12h-45.647c-6.627 0-12-5.373-12-12v-8.659c0-35.745 27.1-50.034 47.579-61.516 17.561-9.845 28.324-16.541 28.324-29.579 0-17.246-21.999-28.693-39.784-28.693-23.189 0-33.894 10.977-48.942 29.969-4.057 5.12-11.46 6.071-16.666 2.124l-27.824-21.098c-5.107-3.872-6.251-11.066-2.644-16.363C184.846 131.491 214.94 112 261.794 112c49.071 0 101.45 38.304 101.45 88.8zM298 368c0 23.159-18.841 42-42 42s-42-18.841-42-42 18.841-42 42-42 42 18.841 42 42z"></path></svg> `\(~~~\)` Anything else?
---
class: inverse, center, middle

# Part 1: RMarkdown

---

# Getting started with RMarkdown

- Install [`R`](https://cran.r-project.org/mirrors.html)
- Install [`RStudio`](https://www.rstudio.com/products/rstudio/download/)
- Install the `rmarkdown` package
- Install `\(\LaTeX\)` (e.g., `TinyTeX`)

```r
install.packages("rmarkdown")
install.packages("tinytex") # for generating PDF output
tinytex::install_tinytex()  # install TinyTeX
```

<img src="data:image/png;base64,#https://shilaan.rbind.io/post/building-your-website-using-r-blogdown/excited.jpg" width="70%" style="display: block; margin: auto;" />

---

## Opening a new R Markdown

- Create a new R Markdown document from the menu `File -> New File -> R Markdown`

<img src="data:image/png;base64,#https://www.pipinghotdata.com/posts/2020-09-07-introducing-the-rstudio-ide-and-r-markdown/gifs/new-rmarkdown.gif" width="90%" style="display: block; margin: auto;" />

---

## Notebook interface

- Allows for direct interaction with R (execute code and display results inline)
- Makes it easy to test and iterate
- Produces a reproducible document with publication-quality output

<img src="data:image/png;base64,#https://d33wubrfki0l68.cloudfront.net/07a00dd9669405f3cba06ef333db180295466252/7b153/lesson-images/how-2-chunk.png" width="90%" style="display: block; margin: auto;" />

---

## Three types of content

- YAML meta-data / frontmatter (between `---` and `---`)
- Text with Markdown formatting
- R code

<img src="data:image/png;base64,#images/rmarkdown.png" width="95%" style="display: block; margin: auto;" />

---
class: inverse, center, middle

# Metadata

---
background-image: url("data:image/png;base64,#https://cwextensions.com/images/logo-someta.png")
background-position: 92% 7%
background-size: 10%

# YAML metadata

The YAML header contains basic metadata and rendering instructions

```yaml
---
title: My R Markdown Report
author: Shilaan Alzahawi
output: pdf_document
date: "2021-11-02"
---
```

--

The date will be **dynamically
updated** every time we knit the report, with the help of the following line of code (more on **in-line code** later):

--

<img src="data:image/png;base64,#images/date.png" width="1037" style="display: block; margin: auto auto auto 0;" />

---

# Preview an RMarkdown

<img src="data:image/png;base64,#https://www.pipinghotdata.com/posts/2020-09-07-introducing-the-rstudio-ide-and-r-markdown/gifs/preview.gif" width="100%" style="display: block; margin: auto;" />

---

# Rendering a document

✓ Windows/Linux: `Control + Shift + K`

✓ OS X: `Command + Shift + K`

<img src="data:image/png;base64,#https://www.pipinghotdata.com/posts/2020-09-07-introducing-the-rstudio-ide-and-r-markdown/gifs/knitting.gif" width="85%" style="display: block; margin: auto;" />

---

<img src="data:image/png;base64,#https://github.com/allisonhorst/stats-illustrations/raw/master/rstats-artwork/rmarkdown_wizards.png" width="100%" style="display: block; margin: auto;" />

Artwork by [@allison_horst](https://github.com/allisonhorst/stats-illustrations): **Become an RMarkdown knitting wizard**

---

## Output formats

---

## What's happening behind the scenes?

☞ `knitr` executes the code within the `.Rmd` file and converts it into an `.md` file;

☞ Pandoc converts the `.md` file to the output format specified in the metadata

---

## What's happening behind the scenes?

Knitting an `RMarkdown` file...

--

1. Starts a new R session

✓ No packages or objects loaded

--

2. Sets your working directory to the location of the `RMarkdown` file

--

3. Executes all code chunks from top to bottom

--

### ⚠️ **Make sure to load all R packages you use!**

<img src="data:image/png;base64,#https://github.com/allisonhorst/stats-illustrations/raw/master/make-your-own-stats-cartoons/ex_4.png" width="45%" style="display: block; margin: auto;" />

Artwork by [@allison_horst](https://github.com/allisonhorst/stats-illustrations)

---
class: inverse, center, middle

# Code

---

## Two types of code in RMarkdown

1. A code chunk, surrounded by three backticks and `{r}`
2. An inline code expression, surrounded by one backtick and `r`

<img src="data:image/png;base64,#https://d33wubrfki0l68.cloudfront.net/4c3760f9341ec07761c95fb5f03e033fa73d206d/057ff/lesson-images/inline-1-heat.png" width="95%" style="display: block; margin: auto auto auto 0;" />

---

## Code chunks

--

"*Code chunks are the beating heart of our R Markdown.*" [Xie, Dervieux, Riederer 2021](https://bookdown.org/yihui/rmarkdown-cookbook/rmarkdown-anatomy.html)

--

```r
summary(Orange)
```

```
##  Tree       age         circumference  
##  3:7   Min.   : 118.0   Min.   : 30.0  
##  1:7   1st Qu.: 484.0   1st Qu.: 65.5  
##  5:7   Median :1004.0   Median :115.0  
##  2:7   Mean   : 922.1   Mean   :115.9  
##  4:7   3rd Qu.:1372.0   3rd Qu.:161.5  
##        Max.   :1582.0   Max.   :214.0
```

--

### Inserting a code chunk

--

✓ Windows/Linux: `Control + Alt + I`

--

✓ OS X: `Command + Option + I`

--

✓ Enclosing code with three backticks and `{r}`

---

## Inserting code chunks

<img src="data:image/png;base64,#https://www.pipinghotdata.com/posts/2020-09-07-introducing-the-rstudio-ide-and-r-markdown/gifs/insert-rchunk.gif" width="95%" style="display: block; margin: auto;" />

---

## Chunk anatomy

<img src="data:image/png;base64,#https://www.pipinghotdata.com/posts/2020-09-07-introducing-the-rstudio-ide-and-r-markdown/gifs/chunk-anatomy-2.gif" width="80%" style="display: block; margin: auto;" />

---

## Naming your code chunks

It's recommended to name your chunks. This allows you to quickly navigate code, automatically name figures, and troubleshoot errors.
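As a minimal sketch of what named chunks look like, the snippet below stores a made-up R Markdown source as a character vector and pulls the chunk labels out of the chunk headers with a regular expression. The labels `load-packages` and `orange-plot` are hypothetical and only serve as illustration. The label is the first word after `r` in the chunk header, and knitr uses it as the stem of the file names for figures produced by that chunk.

```r
# A hypothetical R Markdown source, stored line by line.
# Each chunk header starts with three backticks and {r, followed by
# the chunk label and any comma-separated chunk options.
rmd_lines <- c(
  "```{r load-packages, message = FALSE}",
  "library(knitr)",
  "```",
  "",
  "```{r orange-plot, echo = FALSE}",
  "plot(circumference ~ age, data = Orange)",
  "```"
)

# Keep only the chunk headers, then extract the label that follows "r ".
headers <- grep("^```\\{r", rmd_lines, value = TRUE)
chunk_labels <- sub("^```\\{r ([^,}]+).*$", "\\1", headers)

print(chunk_labels)
```

Descriptive labels like these are what show up in the RStudio chunk navigator and in knitting error messages, which is why they help with troubleshooting.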
<img src="data:image/png;base64,#https://www.pipinghotdata.com/posts/2020-09-07-introducing-the-rstudio-ide-and-r-markdown/gifs/chunk-names.gif" width="80%" style="display: block; margin: auto;" />

---

## Chunk options

Control a chunk's behavior by passing additional, comma-separated arguments

--

✓ `echo = TRUE` show code and output (*default*)

--

✓ `echo = FALSE` show output only (hide code)

--

✓ `include = FALSE` run code, but show neither code nor output

--

✓ `eval = FALSE` show code (do not run; no output)

--

✓ `warning = FALSE` hide warnings

--

✓ `error = FALSE` stop knitting when the code throws an error (`error = TRUE` shows the error message and continues)

--

✓ `message = FALSE` hide messages (e.g., package startup messages)

--

```r
summary(Orange)
```

--

**Bonus question:** What chunk option did I set here?

---

## Chunk options

<img src="data:image/png;base64,#https://www.pipinghotdata.com/posts/2020-09-07-introducing-the-rstudio-ide-and-r-markdown/gifs/chunk-options.gif" width="100%" style="display: block; margin: auto;" />

Credit for all GIFs goes to [Shannon Pileggi](https://www.pipinghotdata.com/posts/2020-09-07-introducing-the-rstudio-ide-and-r-markdown/#rmd)

---

## Chunk execution

`Ctrl + Enter` or `Command + Enter` or press the chunk's run button

<img src="data:image/png;base64,#https://www.pipinghotdata.com/posts/2020-09-07-introducing-the-rstudio-ide-and-r-markdown/gifs/run-chunk.gif" width="100%" style="display: block; margin: auto;" />

---

## In-line code

To insert in-line code, wrap your code in single backticks, with `r` right after the opening backtick.
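The inline syntax can be tried without creating a full document. The sketch below assumes the `knitr` package is installed (it comes along with `rmarkdown`) and knits a one-line, made-up piece of R Markdown text. The sentence and the object names `rmd_text` and `knitted` are illustrative only.

```r
library(knitr)

# A made-up line of R Markdown text containing one inline expression:
# an opening backtick, the letter r, a space, the R code, a closing backtick.
rmd_text <- "The Orange data set has `r nrow(Orange)` rows."

# knit() accepts the source through its `text` argument and then returns
# the knitted result as a character string instead of writing a file.
knitted <- knit(text = rmd_text, quiet = TRUE)

cat(knitted, "\n")
```

The knitted string contains the value of `nrow(Orange)` in place of the inline expression, so the sentence reads as ordinary text.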
RMarkdown will always

- display the results of inline code, but not the code
- apply relevant text formatting to the results

--

**R Markdown document**

<img src="data:image/png;base64,#images/inline.png" width="1867" />

--

**Knitted HTML document**

<img src="data:image/png;base64,#images/inline-knitted.png" width="1667" />

---
class: inverse, center, middle

# Text

---

# Markdown formatting basics

---

For more formatting options, see the [R Markdown Reference guide](https://www.rstudio.com/wp-content/uploads/2015/03/rmarkdown-reference.pdf?_ga=2.157796986.1542626288.1625161001-1806201684.1624641897)

---

## Tables

<img src="data:image/png;base64,#https://d33wubrfki0l68.cloudfront.net/09467251a219c3c6b2dae2bf1367e5736a9ef78c/feeea/lesson-images/tables-1-kable.png" width="90%" />

More on **APA tables** in Pt. 2!

---

## R Markdown tips and tricks

--

📦 Load all R packages in the first code chunk

--

⚠️ Do not include `install.packages()` or `setwd()`

--

RMarkdown checks your spelling!

--

⛑ `Help > Cheatsheets > R Markdown Cheat Sheet`

--

💨 `Help > Markdown Quick Reference`

--

### Resources

- [R Markdown: The Definitive Guide](https://bookdown.org/yihui/rmarkdown/)
- [R Markdown Cookbook](https://bookdown.org/yihui/rmarkdown-cookbook/)

---
class: inverse, center, middle

# Part 2: papaja

---
class: center
background-image: url("data:image/png;base64,#images/papaja.png")
background-position: 50% 60%
background-size: 25%

# Getting started with papaja

**papaja** = **P**reparing **APA** **j**ournal **a**rticles

created by [Frederik Aust](https://github.com/crsh/papaja)

---
background-image: url("data:image/png;base64,#images/manuscript.png")
background-position: 50% 80%
background-size: contain

## Sneak peek: APA title page

---

## Sneak peek: APA tables

--

<img src="data:image/png;base64,#images/table.png" width="2249" />

--

<img src="data:image/png;base64,#images/table-knit.png" width="50%" style="display: block; margin: auto;" />

---

# Getting started with papaja

--

```r
# make sure you've already installed tinytex!
install.packages("devtools")
devtools::install_github("crsh/papaja@devel") # install papaja
```

--

`File > New File > R Markdown > From Template > APA article`

--

<img src="data:image/png;base64,#images/new-apa.png" width="50%" style="display: block; margin: auto;" />

---
class: center
background-image: url("data:image/png;base64,#images/cites.png")
background-position: 50% 70%
background-size: contain

# APA citations

---

## Getting started with APA citations

--

1. Download [Zotero](https://www.zotero.org)

--

2. Download the [Better BibTeX for Zotero extension](https://retorque.re/zotero-better-bibtex/)

--

3. Install citr: an RStudio Addin to Insert Markdown Citations

▸ citr can directly access your reference database

▸ citr can keep your reference file updated

--

```r
devtools::install_github("crsh/citr")
```

---

# Inserting citations

--

1. Create a reference file using a reference manager (e.g., Zotero)

--

2. Supply the reference file in the YAML front matter (between `---` and `---`), using the `bibliography` field

--

3. Insert citations

--

▸ Insert using your citation key

--

▸ Insert using `Addins > Insert citations`

---
class: center
background-image: url("data:image/png;base64,#images/insert-citation.png")
background-position: 50% 50%
background-size: 85%

---

# Inserting citations

<img src="data:image/png;base64,#images/citation-table.png" width="2445" />

---
background-image: url("data:image/png;base64,#images/refs.png")
background-position: 50% 60%
background-size: contain

# Inserting citations

- You can cite R packages, too!
- After loading all packages, run `r_refs()` to create a BibTeX file with references to all currently loaded packages

---

### Harnessing the power of meta-data, code, and text

--

**R Markdown document**

<img src="data:image/png;base64,#images/harness.png" width="2392" />

--

**Knitted APA manuscript**

<img src="data:image/png;base64,#images/harness-knit.png" width="1727" />

---
background-image: url("data:image/png;base64,#https://github.com/allisonhorst/stats-illustrations/raw/master/make-your-own-stats-cartoons/ex_1.png")
background-position: 50% 90%
background-size: 50%

## Statistical output

--

<img src="data:image/png;base64,#images/statistics.png" width="2296" />

--

<img src="data:image/png;base64,#images/statistics-knit.png" width="50%" />

---

## Another look at APA tables

--

<img src="data:image/png;base64,#images/table.png" width="2249" />

--

<img src="data:image/png;base64,#images/table-knit.png" width="50%" style="display: block; margin: auto;" />

---
background-image: url("data:image/png;base64,#images/papaja.png")
background-position: 90% 80%
background-size: 25%

# papaja tips and tricks

--

Define a keyboard shortcut for inserting citations

✂︎ `Tools > Addins > Browse Addins > citr > Keyboard Shortcuts`

--

### Helpful resources

- The [papaja manual](http://frederikaust.com/papaja_man/)
- [Papers](https://github.com/crsh/papaja#papers-written-with-papaja) written with papaja

---
class: right

<img src="data:image/png;base64,#https://github.com/allisonhorst/stats-illustrations/blob/master/rstats-artwork/r_first_then.png?raw=true" width="65%" style="display: block; margin: auto;" />

Artwork by [@allison_horst](https://github.com/allisonhorst/stats-illustrations)

---
class: center, middle

# Thank you! ❤︎

Slides created with the R package [**xaringan**](https://github.com/yihui/xaringan).

**Questions?** Reach out to me at **shilaan@stanford.edu**